The system of claim [illegible] above, wherein the computer comprises a parallel processing
computer comprised of a plurality of nodes, and each node executes one or more threads of the
relational database management system to provide parallelism in the data mining operations.
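
By way of non-limiting illustration, the parallelism recited above can be pictured as each thread computing partial aggregates over its own share of the rows, with the partial results then combined. The sketch below is an assumption made for illustration only: the function names `partial_stats` and `parallel_mean_variance`, the use of Python threads in place of database node threads, and the choice of the sample (n - 1) variance are all hypothetical and are not taken from the claims.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(rows):
    """Work done by one hypothetical node thread: count, sum, sum of squares."""
    return len(rows), sum(rows), sum(v * v for v in rows)


def parallel_mean_variance(partitions):
    """Combine per-partition partial aggregates into a global mean and variance.

    Assumes at least two values in total; the sample (n - 1) variance is an
    illustrative choice, not something the claims specify.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(partial_stats, partitions))
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    uss = sum(p[2] for p in partials)      # uncorrected sum of squares
    mean = total / n
    variance = (uss - n * mean * mean) / (n - 1)
    return n, mean, variance
```

Only three numbers per partition cross the thread boundary, which is why this style of aggregation lends itself to a plurality of nodes.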



The system of claim [illegible], wherein the scalable data mining functions process data
collections stored in the relational database and produce results that are stored in the relational
database.

The system of claim [illegible], wherein the scalable data mining functions are created by
parameterizing and instantiating the analytic API.



The system of claim [illegible], wherein the scalable data mining functions are dynamically
generated queries comprised of combined phrases with substituting values therein based on
parameters supplied to the analytic API.
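
As a non-limiting illustration of a dynamically generated query, the sketch below combines a few query phrases and substitutes values into them from supplied parameters. Everything named here is hypothetical: the `PHRASES` templates, the `build_query` and `quote_ident` helpers, the `sales` table, and the use of SQLite as a stand-in for the relational database management system are assumptions for illustration and do not describe the analytic API itself.

```python
import sqlite3

# Hypothetical phrase templates; placeholders are filled from the parameters.
PHRASES = {
    "select": "SELECT {group_col}, COUNT(*) AS n, AVG({value_col}) AS mean_value",
    "from": "FROM {table}",
    "group": "GROUP BY {group_col}",
    "order": "ORDER BY {group_col}",
}


def quote_ident(name):
    """Quote an SQL identifier so a parameter value cannot inject SQL."""
    return '"' + name.replace('"', '""') + '"'


def build_query(table, group_col, value_col):
    """Combine the phrases and substitute the (quoted) parameter values."""
    params = {
        "table": quote_ident(table),
        "group_col": quote_ident(group_col),
        "value_col": quote_ident(value_col),
    }
    order = ("select", "from", "group", "order")
    return " ".join(PHRASES[key].format(**params) for key in order)


def run_demo():
    """Generate one query and execute it against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("east", 30.0), ("west", 5.0)],
    )
    sql = build_query("sales", "region", "amount")
    rows = conn.execute(sql).fetchall()
    conn.close()
    return sql, rows
```

The same phrases yield a different query for each set of parameters, which is the sense in which the function is parameterized and then instantiated.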



The system of claim [illegible], wherein the scalable data mining functions are selected from
a group of functions comprising Data Description functions, Data Derivation functions, Data
Reduction functions, Data Reorganization functions, Data Sampling functions, and Data
Partitioning functions.

The system of claim [illegible], wherein the Data Description functions comprise descriptive
statistical functions.


The system of claim [illegible], wherein the Data Description functions are selected from a
group comprising: 

(1) descriptive statistics for one or more numeric columns, wherein the statistics are 
selected from a group comprising count, minimum, maximum, mean, standard
deviation, standard mean error, variance, coefficient of variance, skewness, kurtosis, 
uncorrected sum of squares, corrected sum of squares, and quantiles, 

(2) a count of values for a column, 

(3) a calculated modality for a column, 

(4) one or more bin numeric columns of counts with overlay and statistics options,

(5) one or more automatically sub-binned numeric columns giving additional counts and
isolated frequently occurring individual values,

(6) a computed frequency of one or more column values,

(7) a computed frequency of values for pairs of columns in a column list, 

(8) a Pearson Product-Moment Correlation matrix,

(9) a Covariance matrix,

(10) a sum of squares and cross-products matrix, and

(11) a count of overlapping column values in one or more combinations of tables.
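
As a non-limiting illustration of item (1) above, several of the listed statistics can be derived from a handful of aggregates that the database computes in a single pass. The sketch below is an assumption for illustration: the helper `describe_column`, the table `t`, the use of SQLite, the sample (n - 1) variance, and the coefficient of variance expressed as a percentage are all hypothetical choices and are not specified by the claims. Skewness, kurtosis, and quantiles are omitted for brevity.

```python
import math
import sqlite3


def describe_column(conn, table, column):
    """Derive descriptive statistics from in-database aggregates.

    Identifiers are interpolated directly, so this sketch assumes trusted
    table and column names and at least two non-NULL values.
    """
    col = f'"{column}"'
    sql = (
        f"SELECT COUNT({col}), MIN({col}), MAX({col}), SUM({col}), "
        f'SUM({col} * {col}) FROM "{table}"'
    )
    n, lo, hi, total, uss = conn.execute(sql).fetchone()
    mean = total / n
    css = uss - n * mean * mean            # corrected sum of squares
    variance = css / (n - 1)               # sample variance (assumption)
    std = math.sqrt(variance)
    return {
        "count": n,
        "minimum": lo,
        "maximum": hi,
        "mean": mean,
        "standard deviation": std,
        "standard mean error": std / math.sqrt(n),
        "variance": variance,
        "coefficient of variance": 100.0 * std / mean,  # percent (assumption)
        "uncorrected sum of squares": uss,
        "corrected sum of squares": css,
    }


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x REAL)")
    conn.executemany(
        "INSERT INTO t VALUES (?)", [(v,) for v in (2, 4, 4, 4, 5, 5, 7, 9)]
    )
    stats = describe_column(conn, "t", "x")
    conn.close()
    return stats
```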




The system of claim [illegible], wherein the Data Derivation functions provide column
derivations or transformations.

The system of claim [illegible], wherein the Data Description functions are selected from a
group comprising:

(1) a derived binned numeric column wherein a new column is bin number,

(2) a n-valued categorical column dummy-coded into "n" 0/1 values,

(3) a n-valued categorical column recoded into n or less new values,

(4) one or more numeric columns scaled via range transformation,

(5) one or more columns scaled to a z-score that is a number of standard deviations
from a mean,

(6) one or more numeric columns scaled via a sigmoidal transformation function,

(7) one or more numeric columns scaled via a base 10 logarithm function,

(8) one or more numeric columns scaled via a natural logarithm function,

(9) one or more numeric columns scaled via an exponential function,

(10) one or more numeric columns raised to a specified power,

(11) one or more numeric columns derived via user defined transformation function,

(12) one or more new columns derived by ranking one or more columns or expressions
based on order,

(13) one or more new columns derived with quantile 0 to n-1 based on order and n,

(14) a cumulative sum of a value expression based on a sort expression,

(15) a moving average of a value expression based on a width and order,

(16) a moving sum of a value expression based on a width and order,

(17) a moving difference of a value expression based on a width and order,

(18) a moving linear regression value derived from an expression, width, and order,

(19) [illegible] ownership bitmap,

(20) a product ownership bitmap over multiple time periods,

(21) one or more counts, amount, percentage means and intensities derived from a
transaction summary,

(22) one or more variabilities derived from transaction summary data,

(23) one or more derived trigonometric values and their inverses, including sin, arcsin,
cos, arccos, csc, arccsc, sec, arcsec, tan, arctan, cot, and arccot, and

(24) one or more derived hyperbolic values and their inverses, including sinh, arcsinh,
cosh, arccosh, csch, arccsch, sech, arcsech, tanh, arctanh, coth, and arccoth.
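
As a non-limiting illustration of two of the derivations listed above, the sketch below scales a numeric column to a z-score and dummy-codes a categorical column into 0/1 values, in both cases by generating a query that the database executes. The helper names `zscore_rows` and `dummy_code_sql`, the `cust` table, the use of SQLite, and the sample (n - 1) standard deviation are assumptions for illustration only.

```python
import math
import sqlite3


def zscore_rows(conn, table, key, column):
    """Scale a column to z-scores: standard deviations from the mean.

    Pass 1 gathers aggregates in the database; pass 2 substitutes the
    resulting mean and standard deviation into the scaling query.
    Assumes trusted identifiers and at least two non-NULL values.
    """
    col = f'"{column}"'
    n, total, uss = conn.execute(
        f'SELECT COUNT({col}), SUM({col}), SUM({col} * {col}) FROM "{table}"'
    ).fetchone()
    mean = total / n
    std = math.sqrt((uss - n * mean * mean) / (n - 1))
    sql = f'SELECT "{key}", ({col} - ?) / ? FROM "{table}" ORDER BY "{key}"'
    return conn.execute(sql, (mean, std)).fetchall()


def dummy_code_sql(table, key, column, categories):
    """Generate a query that dummy-codes an n-valued column into n 0/1 columns."""
    cases = []
    for cat in categories:
        literal = "'" + cat.replace("'", "''") + "'"   # escaped string literal
        cases.append(
            f'CASE WHEN "{column}" = {literal} THEN 1 ELSE 0 END AS "{column}_{cat}"'
        )
    return f'SELECT "{key}", {", ".join(cases)} FROM "{table}" ORDER BY "{key}"'


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust (id INTEGER, age REAL, tier TEXT)")
    conn.executemany(
        "INSERT INTO cust VALUES (?, ?, ?)",
        [(1, 20.0, "gold"), (2, 30.0, "silver"), (3, 40.0, "gold")],
    )
    z = zscore_rows(conn, "cust", "id", "age")
    dummies = conn.execute(
        dummy_code_sql("cust", "id", "tier", ["gold", "silver"])
    ).fetchall()
    conn.close()
    return z, dummies
```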


The system of claim [illegible], wherein the Data Reduction functions provide matrix
building operations to reduce the amount of data required for analytic algorithms.

12. The system of claim [illegible], wherein the Data Reduction functions are selected from a
group comprising:

(1) build one or more data reduction matrices from a group comprising: (i) a Pearson
Product-Moment Correlations matrix; (ii) a Covariances matrix; and (iii) a Sum of
Squares and Cross Products (SSCP) matrix,

(2) export a resultant matrix, and

(3) restart a matrix operation.
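
As a non-limiting illustration of item (1) above, a Sum of Squares and Cross Products (SSCP) matrix can be built with a single aggregate query, and the Covariances and Correlations matrices can then be derived from it without revisiting the rows. The helper `build_matrices`, the table `obs`, the use of SQLite, and the (n - 1) divisor are assumptions for illustration only.

```python
import math
import sqlite3
from itertools import combinations_with_replacement


def build_matrices(conn, table, columns):
    """Build SSCP, covariance, and correlation matrices from one query.

    Assumes trusted identifiers, no NULLs in the listed columns, at least
    two rows, and non-constant columns. Matrices are dicts keyed by (a, b).
    """
    pairs = list(combinations_with_replacement(columns, 2))
    terms = ["COUNT(*)"]
    terms += [f'SUM("{c}")' for c in columns]
    terms += [f'SUM("{a}" * "{b}")' for a, b in pairs]
    row = conn.execute(f'SELECT {", ".join(terms)} FROM "{table}"').fetchone()

    n = row[0]
    sums = dict(zip(columns, row[1:1 + len(columns)]))
    sscp = {}
    for (a, b), value in zip(pairs, row[1 + len(columns):]):
        sscp[a, b] = sscp[b, a] = value    # the matrix is symmetric

    cov = {
        (a, b): (sscp[a, b] - sums[a] * sums[b] / n) / (n - 1) for (a, b) in sscp
    }
    corr = {
        (a, b): cov[a, b] / math.sqrt(cov[a, a] * cov[b, b]) for (a, b) in cov
    }
    return sscp, cov, corr


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (x REAL, y REAL)")
    conn.executemany(
        "INSERT INTO obs VALUES (?, ?)", [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    )
    result = build_matrices(conn, "obs", ["x", "y"])
    conn.close()
    return result
```

The reduction is that the analytic algorithm afterwards needs only the small matrix, not the underlying rows.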



The system of claim [illegible], wherein the Data Reorganization functions provide an ability
to reorganize data by joining or de-normalizing pre-processed results into a wide analytic data set.

The system of claim [illegible], wherein the Data Reorganization functions are selected from
a group comprising: 

(1) create a de-normalized new table by removing one or more key columns, and 

(2) join a plurality of tables or views into a combined result table. 
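
As a non-limiting illustration of item (2) above, the sketch below joins two tables into one combined result table with one row per customer. The table and column names (`customer`, `txn_summary`, `analytic_set`), the choice of a left join with zero-filled gaps, and the use of SQLite are assumptions for illustration only.

```python
import sqlite3


def build_wide_table(conn):
    """Join pre-processed tables into a single wide analytic table."""
    conn.executescript(
        """
        CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, age INTEGER);
        CREATE TABLE txn_summary (
            cust_id INTEGER PRIMARY KEY, txn_count INTEGER, total_amount REAL
        );
        INSERT INTO customer VALUES (1, 34), (2, 51);
        INSERT INTO txn_summary VALUES (1, 3, 120.5);

        -- Keep every customer; fill missing summary values with zeros.
        CREATE TABLE analytic_set AS
        SELECT c.cust_id,
               c.age,
               COALESCE(t.txn_count, 0) AS txn_count,
               COALESCE(t.total_amount, 0.0) AS total_amount
        FROM customer AS c
        LEFT JOIN txn_summary AS t ON t.cust_id = c.cust_id;
        """
    )
    return conn.execute("SELECT * FROM analytic_set ORDER BY cust_id").fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    rows = build_wide_table(conn)
    conn.close()
    return rows
```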

The system of claim [illegible], wherein the Data Sampling function provides an ability to
construct a new table containing a randomly selected subset of the rows in an existing table or view.

The system of claim [illegible], wherein the Data Sample function selects one or more data
samples of specified sizes from a table.

The system of claim [illegible], wherein the Data Partitioning function provides an ability to
construct a new table containing at least one randomly selected subset of the rows in an existing
table or view, wherein the subsets are mutually distinct but all-inclusive subsets of data.
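
As a non-limiting illustration of the Data Sampling function recited above, the sketch below constructs new tables that each hold a randomly selected subset of a specified size. The helper `sample_table`, the `accounts` table, and the use of SQLite's `ORDER BY RANDOM()` are assumptions for illustration only; the samples drawn this way are independent of one another and may overlap.

```python
import sqlite3


def sample_table(conn, table, out_table, size):
    """Create out_table holding `size` randomly chosen rows of `table`.

    Identifiers are interpolated directly, so trusted names are assumed.
    """
    conn.execute(
        f'CREATE TABLE "{out_table}" AS '
        f'SELECT * FROM "{table}" ORDER BY RANDOM() LIMIT {int(size)}'
    )
    return conn.execute(f'SELECT COUNT(*) FROM "{out_table}"').fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (acct_id INTEGER PRIMARY KEY, balance REAL)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [(i, 100.0 * i) for i in range(10)]
    )
    sizes = {"sample_small": 3, "sample_large": 6}
    counts = {
        name: sample_table(conn, "accounts", name, size)
        for name, size in sizes.items()
    }
    ids = {
        name: [row[0] for row in conn.execute(f'SELECT acct_id FROM "{name}"')]
        for name in sizes
    }
    conn.close()
    return counts, ids
```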

The system of claim [illegible], wherein the Data Partitioning function selects one or more
data partitions from a table using a database internal hashing technique.
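
As a non-limiting illustration of hash-based partitioning, the sketch below assigns each row to exactly one partition according to a hash of its key, so that the partitions are mutually distinct but all-inclusive. SQLite exposes no internal hashing function, so a SHA-256 based function registered from Python stands in for the database internal hashing technique; that substitution, the helper names `stable_hash` and `partition_table`, and the `accounts` table are assumptions for illustration only.

```python
import hashlib
import sqlite3


def stable_hash(value):
    """Stand-in for a database internal hash: a stable 32-bit integer."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def partition_table(conn, table, key, n_parts):
    """Split `table` into n_parts tables by hash of `key` modulo n_parts.

    Identifiers are interpolated directly, so trusted names are assumed.
    """
    conn.create_function("row_hash", 1, stable_hash)
    names = []
    for k in range(n_parts):
        name = f"{table}_part{k}"
        conn.execute(
            f'CREATE TABLE "{name}" AS SELECT * FROM "{table}" '
            f'WHERE row_hash("{key}") % {int(n_parts)} = {k}'
        )
        names.append(name)
    return names


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (acct_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO accounts VALUES (?)", [(i,) for i in range(20)])
    names = partition_table(conn, "accounts", "acct_id", 3)
    parts = [
        [row[0] for row in conn.execute(f'SELECT acct_id FROM "{name}"')]
        for name in names
    ]
    conn.close()
    return parts
```

Because every key hashes to exactly one residue, no row can land in two partitions and none is left out.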

The system of claim [illegible], wherein results of the data mining operations are stored in
the relational databases.

The system of claim [illegible], wherein the relational database management system further
comprises an analytical logical data model that stores metadata and processing results from the
Scalable Data Mining Functions.


A method for performing data mining applications, comprising:

(a) storing a relational database on one or more data storage devices connected to a
computer;

(b) accessing the relational database stored on the data storage devices using a relational
database management system; and

(c) utilizing a comprehensive set of parameterized analytic capabilities for performing data
mining operations directly within a massively parallel relational database management system, the set
of parameterized analytic capabilities including queries for execution by the relational database
management system.


An article of manufacture comprising logic embodying a method for performing data
mining applications, comprising:

(a) storing a relational database on one or more data storage devices connected to a
computer;

(b) accessing the relational database stored on the data storage devices using a relational
database management system; and

(c) utilizing a comprehensive set of parameterized analytic capabilities for performing data
mining operations directly within a massively parallel relational database management system, the set
of parameterized analytic capabilities including queries for execution by the relational database
management system.